UBC/SFU-Shieh-MC1

 

VAST 2012 Challenge
Mini-Challenge 1: Bank of Money Enterprise: Cyber Situation Awareness

 

 

Team Members:

 

Wen Kuang Benjamin Shieh, University of British Columbia, benjamin.shieh@alumni.ubc.ca PRIMARY

Student Team:  YES

 

Tool(s):

 

TABLEAU

MYSQL

Adobe Illustrator (For Mapping)

 

Video:

 

UBC-Shieh-MC1-Video.wmv

 

Answers to Mini-Challenge 1 Questions:

 

MC 1.1  Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe? 

 

The goal here was to create a single dashboard visualization that fits on one screen and allows someone to easily see and explore the overall health and status of the network. Exploration is facilitated by highlighting and filtering actions built into the dashboard. Since the vast majority of computers are completely healthy, defined as policy status equal to 1 and activity flag equal to 1, they are filtered out of the visualization except in the lower table. This removes noise from the visualization and allows one to focus on anomalies.
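
To make the filter concrete, the sketch below shows the kind of MySQL query that could drive this dashboard view. The table and column names (health_status, ipaddress, healthtime, policystatus, activityflag) and the timestamp format are illustrative assumptions, not the exact schema used.

    -- Machines that are NOT completely healthy at the 2 pm BMT snapshot
    -- (completely healthy = policy status 1 AND activity flag 1).
    -- Table/column names are assumed for illustration.
    SELECT ipaddress, policystatus, activityflag
    FROM health_status
    WHERE healthtime = '2012-02-02 14:00:00'
      AND NOT (policystatus = 1 AND activityflag = 1);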

 

Figure 1. Overall health dashboard at 2pm with no highlighting or filtering.

In Figure 1, we can see that the majority of computers with deviated policy statuses are servers and that they mostly report policy status 2. Overall, the system is largely healthy: roughly 80% of machines are completely healthy.
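
A rough way to check these proportions directly in MySQL is sketched below; it assumes the same illustrative health_status table plus an assumed machine_meta table (ipaddress, machineclass, machinefunction, region, businessunit, facility) holding each machine's static attributes.

    -- Share of completely healthy machines at the 2 pm snapshot (roughly 80% expected)
    SELECT 100.0 * SUM(policystatus = 1 AND activityflag = 1) / COUNT(*) AS pct_healthy
    FROM health_status
    WHERE healthtime = '2012-02-02 14:00:00';

    -- Breakdown of the remaining (deviated) machines by class and policy status
    SELECT m.machineclass, h.policystatus, COUNT(*) AS machines
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    WHERE h.healthtime = '2012-02-02 14:00:00'
      AND NOT (h.policystatus = 1 AND h.activityflag = 1)
    GROUP BY m.machineclass, h.policystatus
    ORDER BY m.machineclass, h.policystatus;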

 

Figure 2. Overall health dashboard at 2pm highlighting policy status 5.

Here we see that only one computer is reporting policy status 5, a possible virus detected. It is located at Datacenter 2 at Headquarters.
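
The single policy status 5 machine can be pinpointed with a query like the one below (same assumed tables as above; the facility and business unit labels come from the assumed metadata table).

    -- Which machine reports a possible virus (policy status 5) at 2 pm, and where is it?
    SELECT h.ipaddress, m.businessunit, m.facility
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    WHERE h.healthtime = '2012-02-02 14:00:00'
      AND h.policystatus = 5;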

 

Figure 3. Overall health dashboard at 2pm highlighting Regions 5 and 10.

These two regions stand out from the rest because no computers there report a policy status of 1. Almost all computers report policy status 2, and some report higher statuses.
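
A quick check of this observation, again with the assumed schema, is to list the regions in which not a single machine reports policy status 1 at the snapshot.

    -- Regions with no policy status 1 reports at 2 pm BMT
    SELECT m.region
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    WHERE h.healthtime = '2012-02-02 14:00:00'
    GROUP BY m.region
    HAVING SUM(h.policystatus = 1) = 0;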

 

MC 1.2  Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?

 

Anomaly 1

Here we explore the findings from MC 1.1 Figure 3. Looking across the whole Health Time range, we can see that Regions 5 and 10 start with no completely healthy computers reporting and that this anomaly continues to the end of the reporting period. For comparison, a typical region such as Region 1 starts with almost all computers reporting as completely healthy. Although Regions 5 and 10 start with all computers reporting policy status 2, the rates of increase in policy statuses 3, 4, and 5 are the same as those of a typical region.
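
The underlying comparison can be sketched as a MySQL aggregation over the whole Health Time range; the region identifiers below are placeholders for however regions are actually encoded in the metadata table.

    -- Policy status counts over time for Regions 1, 5, and 10 (Anomaly 1 comparison)
    SELECT h.healthtime, m.region, h.policystatus, COUNT(*) AS machines
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    WHERE m.region IN (1, 5, 10)          -- placeholder region identifiers
    GROUP BY h.healthtime, m.region, h.policystatus
    ORDER BY h.healthtime, m.region, h.policystatus;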

 

Figure 4. Policy status distribution by region. The right chart shows all policy statuses, while the left shows only statuses 4 and 5.

 

As seen in MC 1.1 Figure 2, one computer at Datacenter 2 was reporting policy status 5, and it did so right from the start of reporting. Let's keep these findings in mind while looking at Anomaly 2.

 

Anomaly 2

Here we explore Datacenter 5, which is situated in Region 10. Looking at the total number of IP addresses reporting, we see a large spike on 2 Feb. at 6:00 PM. Drilling down to the number of IP addresses reporting per time zone and region, we see that Business Unit Headquarters shows an abnormal spike in reporting in Time Zone -6 that is not seen in any other region or time zone, and that this spike occurs only at the facility Datacenter 5. Very few machines there report from the start of the period, with jumps in the number of IPs reporting at 2:30 PM, 6:00 PM, and 7:00 PM. By 7:15 PM the number of IPs reporting matches that of the other Datacenters. We use Datacenter 2 as a comparison; recall that Datacenter 2 was the location of the first detected policy status 5. What is interesting here is that, after reporting returns to normal, the rates of change in policy status mimic those of the other Datacenters. Furthermore, at 7:15 PM the number of IPs reporting is nearly the same as in the other Datacenters. It is as if this Datacenter was experiencing the same deterioration in policy status as the other Datacenters but, for some reason, simply was not reporting.
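
The reporting gap itself can be sketched as a count of distinct IP addresses per reporting interval at the two facilities; the facility labels below are assumptions about how they appear in the metadata table.

    -- Distinct IPs reporting per health-time interval at Datacenters 2 and 5
    SELECT h.healthtime, m.facility, COUNT(DISTINCT h.ipaddress) AS ips_reporting
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    WHERE m.facility IN ('Datacenter 2', 'Datacenter 5')   -- assumed facility labels
    GROUP BY h.healthtime, m.facility
    ORDER BY h.healthtime, m.facility;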

 

Figure 5. A. Number of IP Addresses Reporting per Time Zone and Selective Regions. B. Number of IP Addresses Reporting Overall. C. Policy Status of Data Centers 2 and 5.

 

Figure 6. Activity Flag of Data Centers 2 and 5.

 

We also see a similar pattern in the reported activity flags: abnormally low numbers reporting at the start, then spikes, then normal reporting with patterns similar to the other Datacenters. These findings, coupled with the findings in Anomaly 1, strongly suggest that the system was already infected, at some indeterminate time before reporting started, by malicious software that degrades system health.

 

Anomaly 3

Having established where the first policy status 5 was detected, along with other problematic areas, we now look at how policy status 5 propagates through the network.

 

Figure 7. A. All reported policy statuses over the whole period. B. Map of policy status 5 spread. C. Percent of policy statuses reported by region. D. Percent of policy statuses reported by machine class and function.

 

In Figure 7 A, we can see that the network experiences a gradual deterioration in policy status. The bumps can be attributed to workstations that are turned off overnight and therefore do not report in the early mornings. In B, we can see that the spread reaches throughout the system; no region is unaffected. The spread does not appear to follow a specific pattern, as seen using the Pages feature in Tableau; this cannot be seen in the snapshot but is shown in the video. In C, we see that policy statuses are split in roughly the same proportions across the different regions, with the exception of Regions 5 and 10. Those regions have no policy status 1 reports, but their number of policy status 2 reports appears to equal the sum of the policy status 1 and 2 counts in a "normal" region. In D, we see that policy status distributions are proportional across all machine classes and machine functions. These findings strongly suggest that the malicious software affecting the network is not targeting any particular characteristic of the network.
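
The per-region proportions in C (and, by swapping the grouping columns to machine class and function, the proportions in D) can be sketched in MySQL as below, again using the assumed health_status and machine_meta tables.

    -- Share of each policy status within each region over the whole period
    SELECT m.region, h.policystatus,
           100.0 * COUNT(*) / r.total AS pct_of_region
    FROM health_status h
    JOIN machine_meta m ON m.ipaddress = h.ipaddress
    JOIN (SELECT m2.region, COUNT(*) AS total
          FROM health_status h2
          JOIN machine_meta m2 ON m2.ipaddress = h2.ipaddress
          GROUP BY m2.region) AS r ON r.region = m.region
    GROUP BY m.region, h.policystatus, r.total
    ORDER BY m.region, h.policystatus;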

 

Acknowledgements

Thanks to the Vancouver Institute of Visual Analytics (VIVA) and the MAGIC lab at UBC for providing access to tools and lab space.